-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Stretch (hello-robot) #409
Conversation
2381002
to
4e531f0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass. Still reviewing
def find_camera_indices(raise_when_empty=True) -> list[int]: | ||
def find_cameras_info(raise_when_empty=True) -> dict[int, str]: | ||
""" | ||
Find the serial numbers of the Intel RealSense cameras | ||
Find the names and the serial numbers of the Intel RealSense cameras | ||
connected to the computer. | ||
""" | ||
camera_ids = [] | ||
cameras_info = {} | ||
for device in rs.context().query_devices(): | ||
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX))) | ||
camera_ids.append(serial_number) | ||
name = device.get_info(rs.camera_info.name) | ||
cameras_info[serial_number] = name | ||
|
||
if raise_when_empty and len(camera_ids) == 0: | ||
if raise_when_empty and len(cameras_info) == 0: | ||
raise OSError( | ||
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware." | ||
) | ||
|
||
return camera_ids | ||
return cameras_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not fan of modifying this function, to stay consistent with the one in opencv.py
Instead, could we have another function that given camera_indices / serial number retrieves the camera name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that we would still need to iterate through rs.context().query_devices()
in that function and compare device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX))
against the given serial_number before accessing the name. We can do it but:
-
- Would makes things harder to read and understand why it's doing those convolutions
-
- Would be slower as you'd iterate twice on the devices
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What can we do to unify opencv and intelrealsense on this function?
I would imagine find_cameras_info(raise_when_empty=True) -> list[dict]:
for both, with camera_infos[0]["index"]
being the camera index port for opencv or intelrealsense serial, and camera_infos[0]["name"]
not existing for opencv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: camera_infos
instead of cameras_info
for consistency with camera_ids
and how we usually add a s
at the end for our variable names of type list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What can we do to unify opencv and intelrealsense on this function?
As we discussed, I think we should eventually converge to a single Camera
class with different backends (realsense, opencv...). This would probably simplify the design and avoid having to monkeypatch those classes in the tests. Then we could think about the design of this function (which would be a single one). Wdyt?
Nit: camera_infos instead of cameras_info for consistency with camera_ids and how we usually add a s at the end for our variable names of type list
I wondered about the exact same thing but the plural of "info" or "information" in English is without a s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I agree, but I think we should homogenise opencv and intelrealsense at this level before merging. What do you think of my suggestion?
-
I know but we dont write in english, we write in python (which is an english dialect) so the english grammar rules dont apply the same way ^^
50aa25c
to
dee3b0a
Compare
Co-authored-by: Remi <remi.cadene@huggingface.co>
dee3b0a
to
88526d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @aliberts, looks great! I've only done a read-through so far. I can run it on actual hardware later this week. I don't want to delay your PR, so don't feel obliged to wait.
examples/8_use_stretch.md
Outdated
``` | ||
This is essentially the same as running `stretch_gamepad_teleop.py` | ||
|
||
Store your Hugging Face repository name in a variable to run these commands: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be a huggingface-cli login
step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you're correct. Technically, it's already pointed out in the 7_get_started_with_real_robot.md as some other details and I didn't want to add to avoid too much redundancy. I've added a word on that (7c87025)
Eventually we might rearrange these markdown as the number of robots grow (cc @Cadene)
if camera_ids is None: | ||
camera_ids = find_camera_indices() | ||
if len(camera_ids) == 0: | ||
camera_ids = find_cameras_info() | ||
|
||
print("Connecting cameras") | ||
cameras = [] | ||
for cam_idx in camera_ids: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the variable should be renamed to cameras_info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this was because it was copied from a very similar function for opencv cameras (see this discussion).
I'll wait for that discussion to resolve before changing it.
@@ -237,6 +248,36 @@ def __init__( | |||
self.depth_map = None | |||
self.logs = {} | |||
|
|||
# TODO(alibets): Do we keep original width/height or do we define them after rotation? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely nice to keep width/height consistent with the rotated image
self.is_connected = False | ||
self.teleop = None | ||
self.logs = {} | ||
# TODO(aliberts): remove original low-level logging from stretch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add
at the top of the file to raise the logging level. I might be able to also add a boolean to outright disable Stretch Body logging if it still interferes with your loggers. Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried that in the past and it didn't work, but somehow since then the logs from Stretch have disappeared and I'm not exactly sure why 😅
I've added it e0fa5e4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's interesting 😂. I'll let you know if I see them when I run it.
if self.state_keys is None: | ||
self.state_keys = list(state) | ||
|
||
if not record_data: | ||
return | ||
|
||
state = torch.as_tensor(list(state.values())) | ||
action = torch.as_tensor(list(action.values())) | ||
|
||
# Capture images from cameras | ||
images = {} | ||
for name in self.cameras: | ||
before_camread_t = time.perf_counter() | ||
images[name] = self.cameras[name].async_read() | ||
images[name] = torch.from_numpy(images[name]) | ||
self.logs[f"read_camera_{name}_dt_s"] = self.cameras[name].logs["delta_timestamp_s"] | ||
self.logs[f"async_read_camera_{name}_dt_s"] = time.perf_counter() - before_camread_t | ||
|
||
# Populate output dictionnaries | ||
obs_dict, action_dict = {}, {} | ||
obs_dict["observation.state"] = state | ||
action_dict["action"] = action | ||
for name in self.cameras: | ||
obs_dict[f"observation.images.{name}"] = images[name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lot of duplicate code. Could call self.get_observation()
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cadene I know we can't do that for send_action
but after looking at it, I don't see why we can't simply call self.get_observation()
in teleop_step
. I've seen your comment in manipulator.py
that get_observation
doesn't return a batch dim but so does teleop_step
.
Wdyt, am I missing something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean self.capture_observation()
?
I am fine with this duplicated code for now. We have the exact same pattern in manipulator.py
. We will refactor at some point.
We can add a TODO(rcadene, aliberts): refactor with self.capture_observation()
|
||
def print_logs(self) -> None: | ||
... | ||
# TODO(aliberts): move robot-specific logs logic here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for log in self.logs:
self.logger.info(f"{log}: {self.logs[log]}")
i think
lerobot/configs/robot/stretch.yaml
Outdated
robot_type: stretch3 | ||
|
||
cameras: | ||
# TODO(aliberts): hardware currently not working, uncomment and test this when fixed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this get resolved? This is specific to your robot, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacement board is on the way ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to hear it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, almost ready to merge
def print_logs(self): | ||
... | ||
# TODO(aliberts): move robot-specific logs logic here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we use pass
instead of ...
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done eb48452
def print_logs(self) -> None: | ||
... | ||
# TODO(aliberts): move robot-specific logs logic here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same pass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done eb48452
def find_camera_indices(raise_when_empty=True) -> list[int]: | ||
def find_cameras_info(raise_when_empty=True) -> dict[int, str]: | ||
""" | ||
Find the serial numbers of the Intel RealSense cameras | ||
Find the names and the serial numbers of the Intel RealSense cameras | ||
connected to the computer. | ||
""" | ||
camera_ids = [] | ||
cameras_info = {} | ||
for device in rs.context().query_devices(): | ||
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX))) | ||
camera_ids.append(serial_number) | ||
name = device.get_info(rs.camera_info.name) | ||
cameras_info[serial_number] = name | ||
|
||
if raise_when_empty and len(camera_ids) == 0: | ||
if raise_when_empty and len(cameras_info) == 0: | ||
raise OSError( | ||
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware." | ||
) | ||
|
||
return camera_ids | ||
return cameras_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What can we do to unify opencv and intelrealsense on this function?
I would imagine find_cameras_info(raise_when_empty=True) -> list[dict]:
for both, with camera_infos[0]["index"]
being the camera index port for opencv or intelrealsense serial, and camera_infos[0]["name"]
not existing for opencv.
def find_camera_indices(raise_when_empty=True) -> list[int]: | ||
def find_cameras_info(raise_when_empty=True) -> dict[int, str]: | ||
""" | ||
Find the serial numbers of the Intel RealSense cameras | ||
Find the names and the serial numbers of the Intel RealSense cameras | ||
connected to the computer. | ||
""" | ||
camera_ids = [] | ||
cameras_info = {} | ||
for device in rs.context().query_devices(): | ||
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX))) | ||
camera_ids.append(serial_number) | ||
name = device.get_info(rs.camera_info.name) | ||
cameras_info[serial_number] = name | ||
|
||
if raise_when_empty and len(camera_ids) == 0: | ||
if raise_when_empty and len(cameras_info) == 0: | ||
raise OSError( | ||
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware." | ||
) | ||
|
||
return camera_ids | ||
return cameras_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: camera_infos
instead of cameras_info
for consistency with camera_ids
and how we usually add a s
at the end for our variable names of type list
Co-authored-by: Remi <remi.cadene@huggingface.co>
Co-authored-by: Remi <remi.cadene@huggingface.co> Co-authored-by: Remi Cadene <re.cadene@gmail.com>
What this does
Adds native support for Stretch 3
StretchRobot
and associated config for a standard Stretch 3examples/8_use_stretch.md
TODO in this PR:
currently notworking when streaming Stretch's GUI through MoonlightTODO in future PRs:
control_robot
in their respective classesobservation.images
shapes in datasets metadataHow it was tested
I tested calibration, teleoperation and recoding in the 3 setups (tethered, untethered and ssh). I also tested replaying an episode with the tethered setup.
How to checkout & try? (for the reviewer)
python lerobot/scripts/control_robot.py record \ --robot-path lerobot/configs/robot/stretch.yaml \ --fps 30 \ --root data \ --repo-id ${HF_USER}/stretch_test \ --tags stretch tutorial \ --warmup-time-s 3 \ --episode-time-s 40 \ --reset-time-s 10 \ --num-episodes 2 \ --push-to-hub 0
python lerobot/scripts/control_robot.py replay \ --robot-path lerobot/configs/robot/stretch.yaml \ --fps 30 \ --root data \ --repo-id ${HF_USER}/stretch_test \ --episode 0